Diyarbakir Province
TRAJECT-Bench: A Trajectory-Aware Benchmark for Evaluating Agentic Tool Use
He, Pengfei, Dai, Zhenwei, He, Bing, Liu, Hui, Tang, Xianfeng, Lu, Hanqing, Li, Juanhui, Ding, Jiayuan, Mukherjee, Subhabrata, Wang, Suhang, Xing, Yue, Tang, Jiliang, Dumoulin, Benoit
Large language model (LLM)-based agents increasingly rely on tool use to complete real-world tasks. Existing work evaluates LLMs' tool-use capability but largely focuses on final answers and overlooks the detailed tool-usage trajectory, i.e., whether tools are selected, parameterized, and ordered correctly. We introduce TRAJECT-Bench, a trajectory-aware benchmark to comprehensively evaluate LLMs' tool-use capability through diverse tasks with fine-grained evaluation metrics. TRAJECT-Bench pairs high-fidelity, executable tools across practical domains with tasks grounded in production-style APIs, and synthesizes trajectories that vary in breadth (parallel calls) and depth (interdependent chains). Besides final accuracy, TRAJECT-Bench also reports trajectory-level diagnostics, including tool selection and argument correctness, and dependency/order satisfaction. Analyses reveal failure modes such as similar-tool confusion and parameter-blind selection, as well as scaling behavior with tool diversity and trajectory length, including a bottleneck in the transition from short to mid-length trajectories, offering actionable guidance for LLMs' tool use.
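To make the trajectory-level diagnostics named in the abstract concrete, here is a minimal Python sketch of how tool selection, argument correctness, and order satisfaction could be scored against a reference trajectory. It is not TRAJECT-Bench's actual metric code: the function name `trajectory_metrics`, the exact-match scoring rules, the example tool names, and the assumption that each tool appears at most once per trajectory are all hypothetical simplifications.

```python
from typing import Any

# A tool call is (tool name, arguments). Hypothetical representation.
ToolCall = tuple[str, dict[str, Any]]


def trajectory_metrics(predicted: list[ToolCall], reference: list[ToolCall]) -> dict[str, Any]:
    """Score a predicted trajectory against a reference one.

    Assumes each tool name appears at most once per trajectory.
    """
    pred_names = [name for name, _ in predicted]
    ref_names = [name for name, _ in reference]
    pred_args = dict(predicted)  # tool name -> arguments

    # Tool selection: fraction of reference tools that were called at all.
    selected = [name for name in ref_names if name in pred_args]
    tool_selection = len(selected) / len(ref_names)

    # Argument correctness: among selected tools, fraction whose
    # arguments match the reference exactly.
    correct_args = sum(1 for name, args in reference
                       if name in pred_args and pred_args[name] == args)
    argument_correctness = correct_args / len(selected) if selected else 0.0

    # Order satisfaction: shared tools appear in the same relative order.
    shared_pred = [name for name in pred_names if name in ref_names]
    shared_ref = [name for name in ref_names if name in pred_names]
    order_satisfied = shared_pred == shared_ref

    return {
        "tool_selection": tool_selection,
        "argument_correctness": argument_correctness,
        "order_satisfied": order_satisfied,
    }


reference = [("search_flights", {"to": "VIE"}), ("book_hotel", {"city": "Vienna"})]
predicted = [("book_hotel", {"city": "Vienna"}), ("search_flights", {"to": "MNL"})]
result = trajectory_metrics(predicted, reference)
```

In this example both reference tools are selected, but one call carries a wrong argument and the calls are out of order. A final-answer check alone would not separate those two failures.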
- Europe > Austria > Vienna (0.14)
- Asia > Philippines > Luzon > National Capital Region > City of Manila (0.14)
- Europe > France (0.04)
- (35 more...)
- Leisure & Entertainment (1.00)
- Consumer Products & Services > Travel (1.00)
- Media > Music (0.96)
- (2 more...)
Learning from End User Data with Shuffled Differential Privacy over Kernel Densities
We study a setting of collecting and learning from private data distributed across end users. In the shuffled model of differential privacy, the end users partially protect their data locally before sharing it, and their data is also anonymized during its collection to enhance privacy. This model has recently become a prominent alternative to central DP, which requires full trust in a central data curator, and local DP, where fully local data protection takes a steep toll on downstream accuracy. Our main technical result is a shuffled DP protocol for privately estimating the kernel density function of a distributed dataset, with accuracy essentially matching central DP. We use it to privately learn a classifier from the end user data, by learning a private density function per class. Moreover, we show that the density function itself can recover the semantic content of its class, despite having been learned in the absence of any unprotected data. Our experiments show the favorable downstream performance of our approach, and highlight key downstream considerations and trade-offs in a practical ML deployment of shuffled DP.
Collecting statistics on end user data is commonly required in data analytics and machine learning. As it could leak private user information, privacy guarantees need to be incorporated into the data collection pipeline. Differential Privacy (DP) (Dwork et al., 2006) currently serves as the gold standard for privacy in machine learning. Most of its success has been in the central DP model, where a centralized data curator holds the private data of all the users and is charged with protecting their privacy. However, this model does not address how to collect the data from end users in the first place. The local DP model (Kasiviswanathan et al., 2011), where end users protect the privacy of their data locally before sharing it, is often used for private data collection (Erlingsson et al., 2014; Ding et al., 2017; Apple, 2017).
However, compared to central DP, local DP often comes at a steep price of degraded accuracy in downstream uses of the collected data. The shuffled DP model (Bittau et al., 2017; Cheu et al., 2019; Erlingsson et al., 2019) has recently emerged as a prominent intermediate alternative. In this model, the users partially protect their data locally, and then entrust a centralized authority, called the "shuffler", with the single operation of shuffling (or anonymizing) the data from all participating users.
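The three roles in the shuffled model (local randomizer, shuffler, analyzer) can be illustrated with a minimal Python sketch. It uses binary randomized response to estimate a population mean, which is a standard textbook mechanism and not the kernel density protocol of the paper. The function names, the local parameter `epsilon0`, and the simulated population are assumptions for illustration, and the privacy-amplification analysis that shuffling provides is omitted.

```python
import math
import random


def local_randomize(bit: int, epsilon0: float, rng: random.Random) -> int:
    """User side: keep the true bit with probability e^eps0 / (e^eps0 + 1)."""
    p_truth = math.exp(epsilon0) / (math.exp(epsilon0) + 1)
    return bit if rng.random() < p_truth else 1 - bit


def shuffle_reports(reports: list[int], rng: random.Random) -> list[int]:
    """Shuffler: its single operation is to permute the reports,
    removing the link between a report and the user who sent it."""
    shuffled = list(reports)
    rng.shuffle(shuffled)
    return shuffled


def estimate_mean(reports: list[int], epsilon0: float) -> float:
    """Analyzer: debias the observed frequency of ones.

    E[observed] = (1 - p) + mean * (2p - 1), solved for mean.
    """
    p_truth = math.exp(epsilon0) / (math.exp(epsilon0) + 1)
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth)) / (2 * p_truth - 1)


rng = random.Random(0)
true_bits = [1] * 3000 + [0] * 7000  # simulated users, true mean 0.3
noisy = [local_randomize(b, epsilon0=1.0, rng=rng) for b in true_bits]
collected = shuffle_reports(noisy, rng)
estimate = estimate_mean(collected, epsilon0=1.0)
```

The analyzer sees only the shuffled multiset of noisy bits, yet the debiased estimate should land close to the true mean of 0.3 for a population of this size.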
- Asia > Middle East > Republic of Türkiye > Diyarbakir Province > Diyarbakir (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
- (5 more...)
Development of Multistage Machine Learning Classifier using Decision Trees and Boosting Algorithms over Darknet Network Traffic
Nair, Anjali Sureshkumar, Nitnaware, Prashant
In recent years, the clandestine nature of darknet activities has presented an escalating challenge to cybersecurity efforts, necessitating sophisticated methods for the detection and classification of network traffic associated with these covert operations. The system addresses the significant challenge of class imbalance within Darknet traffic datasets, where malicious traffic constitutes a minority, hindering effective discrimination between normal and malicious behavior. By leveraging boosting algorithms such as AdaBoost and Gradient Boosting coupled with decision trees, this study proposes a robust solution for network traffic classification. As ensemble learners, boosting algorithms correct errors iteratively and assign higher weights to minority class instances, complemented by the hierarchical structure of decision trees. Feature selection is additionally employed as a preprocessing step, using Information Gain, Fisher's Score, and the Chi-Square test. Rigorous experimentation with diverse Darknet traffic datasets validates the efficacy of the proposed multistage classifier, evaluated through various performance metrics such as accuracy, precision, recall, and F1-score, offering a comprehensive solution for accurate detection and classification of Darknet activities.
- Asia > India > Maharashtra > Mumbai (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
- (6 more...)
- Information Technology > Security & Privacy (1.00)
- Government > Military > Cyberwarfare (0.70)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.89)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
Comparative Analysis of XGBoost and Minirocket Algorithms for Human Activity Recognition
Human Activity Recognition (HAR) has been extensively studied, with recent emphasis on the implementation of advanced Machine Learning (ML) and Deep Learning (DL) algorithms for accurate classification. This study investigates the efficacy of two ML algorithms, eXtreme Gradient Boosting (XGBoost) and MiniRocket, in the realm of HAR using data collected from smartphone sensors. The experiments are conducted on a dataset obtained from the UCI repository, comprising accelerometer and gyroscope signals captured from 30 volunteers performing various activities while wearing a smartphone. The dataset undergoes preprocessing, including noise filtering and feature extraction, before being utilized for training and testing the classifiers. Monte Carlo cross-validation is employed to evaluate the models' robustness. The findings reveal that both XGBoost and MiniRocket attain accuracy, F1 score, and AUC values as high as 0.99 in activity classification. XGBoost exhibits a slightly superior performance compared to MiniRocket. Notably, both algorithms surpass the performance of other ML and DL algorithms reported in the literature for HAR tasks. Additionally, the study compares the computational efficiency of the two algorithms, revealing XGBoost's advantage in terms of training time. Furthermore, the performance of MiniRocket, which achieves accuracy and F1 values of 0.94, and an AUC value of 0.96 using raw data and utilizing only one channel from the sensors, highlights the potential of directly leveraging unprocessed signals. It also suggests potential advantages that could be gained by utilizing sensor fusion or channel fusion techniques. Overall, this research sheds light on the effectiveness and computational characteristics of XGBoost and MiniRocket in HAR tasks, providing insights for future studies in activity recognition using smartphone sensor data.
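Monte Carlo cross-validation, which the abstract uses to assess robustness, repeatedly draws a fresh random train/test split and averages the scores. The Python sketch below illustrates the procedure with a toy one-dimensional nearest-centroid classifier standing in for XGBoost or MiniRocket. The function names, split count, test fraction, and synthetic data are assumptions, not the study's configuration.

```python
import random
from statistics import mean


def monte_carlo_cv(samples, labels, fit, predict,
                   n_splits=20, test_fraction=0.3, seed=0):
    """Return one accuracy score per random train/test split."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    n_test = max(1, int(len(samples) * test_fraction))
    scores = []
    for _ in range(n_splits):
        rng.shuffle(indices)  # a new random partition on every repetition
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        model = fit([samples[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        correct = sum(predict(model, samples[i]) == labels[i] for i in test_idx)
        scores.append(correct / n_test)
    return scores


def fit_centroids(xs, ys):
    """Toy stand-in classifier: the mean feature value per class."""
    return {label: mean(x for x, y in zip(xs, ys) if y == label)
            for label in set(ys)}


def predict_centroid(model, x):
    """Predict the class whose centroid is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))


# Synthetic, well-separated data: low values "sitting", high values "walking".
xs = [i / 10 for i in range(10)] + [5 + i / 10 for i in range(10)]
ys = ["sitting"] * 10 + ["walking"] * 10
scores = monte_carlo_cv(xs, ys, fit_centroids, predict_centroid)
average_accuracy = mean(scores)
```

Unlike k-fold cross-validation, the splits here are drawn independently, so a sample may land in the test set several times or not at all. The spread of `scores` gives a rough view of how sensitive the model is to the particular split.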
- Asia > Middle East > Republic of Türkiye > Diyarbakir Province > Diyarbakir (0.07)
- North America > United States > California > Orange County > Irvine (0.05)
- Research Report > New Finding (0.48)
- Research Report > Experimental Study (0.34)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Empirical and Experimental Insights into Data Mining Techniques for Crime Prediction: A Comprehensive Survey
This survey paper presents a comprehensive analysis of crime prediction methodologies, exploring the various techniques and technologies utilized in this area. The paper covers the statistical methods, machine learning algorithms, and deep learning techniques employed to analyze crime data, while also examining their effectiveness and limitations. We propose a methodological taxonomy that classifies crime prediction algorithms into specific techniques. This taxonomy is structured into four tiers, including methodology category, methodology sub-category, methodology techniques, and methodology sub-techniques. Empirical and experimental evaluations are provided to rank the different techniques. The empirical evaluation assesses the crime prediction techniques based on four criteria, while the experimental evaluation ranks the algorithms that employ the same sub-technique, the different sub-techniques that employ the same technique, the different techniques that employ the same methodology sub-category, the different methodology sub-categories within the same category, and the different methodology categories. The combination of methodological taxonomy, empirical evaluations, and experimental comparisons allows for a nuanced and comprehensive understanding of crime prediction algorithms, aiding researchers in making informed decisions. Finally, the paper provides a glimpse into the future of crime prediction techniques, highlighting potential advancements and opportunities for further research in this field.
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
- North America > United States > Maryland > Baltimore (0.14)
- (50 more...)
- Overview (1.00)
- Instructional Material (1.00)
- Research Report > New Finding (0.93)